DevOps / Site Reliability Engineer | up to 150K SGD per year

Second Talent • Full-time • Singapore, SG • 1w ago

Company Summary

We are hiring for an innovative and dynamic technology company dedicated to building robust and scalable solutions. Our team is at the forefront of developing and maintaining cutting-edge infrastructure that supports a diverse range of business units. We are passionate about using automation and best-in-class engineering practices to ensure our systems are reliable, efficient, and perform at the highest level.

Job Description

We are looking for a highly skilled and motivated individual to join our team as an infrastructure engineer. In this role, you will be responsible for managing and optimizing our core infrastructure and services. You will focus on maintaining and enhancing our container infrastructure (Kubernetes and Docker), as well as clusters of open-source components such as Kafka, Redis, and Elasticsearch. A key part of your role will be to ensure the performance, scalability, and reliability of our distributed systems.

You will also play a crucial part in designing and developing our infrastructure operation platforms, including building and maintaining CI/CD pipelines, monitoring and alerting systems, and centralized logging. We're looking for someone who can drive automation initiatives and create self-service tools that boost team productivity. You will lead efforts to ensure maximum uptime for our production services by proactively monitoring and responding to incidents, and you will contribute to our SLA/SLO frameworks and reliability engineering practices.

Job Requirements

Experience & Education

3+ years of hands-on experience in Systems Operations, DevOps, or Site Reliability Engineering (SRE).
A bachelor's degree in Computer Science, Engineering, or a related technical field is preferred.
Must be based in Singapore; no work visa will be provided.

Technical Skills

Strong experience with containerization technologies, including Kubernetes and Docker.
Proficiency in scripting and automation using languages such as Shell or Python.
Proven experience with public cloud platforms like AWS, Azure, or GCP is highly valued.
Hands-on experience operating production-grade container clusters and managing CI/CD pipelines.
Strong understanding of large-scale internet architecture and distributed systems.
Familiarity with common infrastructure components, including Nginx, MySQL, Redis, Kafka, and Elasticsearch.
Experience with infrastructure monitoring, logging, and observability tools.